Sparse multi-class prediction based on the Group Lasso in multinomial logistic regression
Authors
Abstract
Preface

Many classification procedures are based on variable selection methodologies. This master thesis concentrates on continuous variable selection procedures based on the shrinkage principle. In general, we would like to find sparse prediction rules for multi-class classification problems that increase not only the prediction accuracy but also the interpretability of the obtained prediction rules. For these reasons we have chosen the multinomial logistic regression model, as its penalization procedures perform continuous variable selection and generally lead to sparse prediction rules. Within the multinomial logistic regression model we have implemented Ridge, Lasso, Elastic net, and Group Lasso penalization. The emphasis of this research lies on the Lasso and the Group Lasso.

In a multi-class classification problem, the Lasso performs variable selection on individual regression coefficients. In the multinomial regression model each predictor has one regression coefficient per class, so selecting individual regression coefficients is less natural than selecting an entire predictor. As a result, the Lasso may retain redundant predictors, leading to more retained predictors and less interpretable prediction rules. To overcome this problem we have developed a Group Lasso procedure with a novel group structure. The advantage of the Group Lasso is that it performs variable selection on predefined groups; in our model the group structure collects, for each predictor, its regression coefficients across all classes. This group structure facilitates the selection of an entire predictor. We demonstrate, on the basis of gene expression profiles of 531 well-characterized Acute Myeloid Leukemia patients, that the Group Lasso retains fewer predictors with a similar prediction accuracy compared to the regular Lasso.
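The contrast between the two penalties can be sketched with their proximal operators: the Lasso soft-thresholds each entry of the predictors-by-classes coefficient matrix individually, while a Group Lasso with the group structure described above (one group per predictor, spanning all classes) either shrinks or zeroes an entire row at once. A minimal NumPy sketch, with a made-up toy coefficient matrix purely for illustration:

```python
import numpy as np

def lasso_prox(B, lam):
    # Soft-thresholding: shrinks every coefficient individually, so a
    # predictor can remain "half selected" (nonzero for some classes only).
    return np.sign(B) * np.maximum(np.abs(B) - lam, 0.0)

def group_lasso_prox(B, lam):
    # Row-wise shrinkage: one group per predictor, spanning all classes.
    # A row is either shrunk as a whole or set exactly to zero.
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
    return scale * B

# Toy coefficient matrix: 3 predictors x 2 classes (illustrative values).
B = np.array([[0.50, -0.10],
              [0.05,  0.04],
              [-0.80, 0.90]])

print(lasso_prox(B, 0.1))        # zeros individual entries only
print(group_lasso_prox(B, 0.1))  # zeros the second predictor's whole row
```

With this group structure a predictor is dropped from the model only when its coefficients are small across all classes simultaneously, which is exactly what makes whole-predictor selection possible.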
Finally, the Group Lasso facilitates the comparison of regression coefficients between classes for each predictor, which is not possible with the Lasso.

Acknowledgement

I would like to express my appreciation and gratitude to Jelle Goeman, who has been my mentor and supervisor during this study. I would like to thank him for giving me the opportunity to work at such a pleasant and successful department. He has the uncanny ability to remain critical and to explain difficult problems in a way that I could understand them. I look forward to the many projects we could work on together. Next, I would like to express my gratitude to Marcel Reinders, my supervisor during the Master course. Marcel has the great ability to translate problems and descriptions, stemming …
Similar resources
GAP Safe screening rules for sparse multi-task and multi-class models
High dimensional regression benefits from sparsity promoting regularizations. Screening rules leverage the known sparsity of the solution by ignoring some variables in the optimization, hence speeding up solvers. When the procedure is proven not to discard features wrongly, the rules are said to be safe. In this paper we derive new safe rules for generalized linear models regularized with ℓ1 and...
Robust Inference on Average Treatment Effects with Possibly More Covariates than Observations
This paper concerns robust inference on average treatment effects following model selection. In the selection on observables framework, we show how to construct confidence intervals based on a doubly-robust estimator that are robust to model selection errors and prove that they are valid uniformly over a large class of treatment effect models. The class allows for multivalued treatments with he...
Sparse Bayes estimation in non-Gaussian models via data augmentation
In this paper we provide a data-augmentation scheme that unifies many common sparse Bayes estimators into a single class. This leads to simple iterative algorithms for estimating the posterior mode under arbitrary combinations of likelihoods and priors within the class. The class itself is quite large: for example, it includes quantile regression, support vector machines, and logistic and multi...
Multinomial classification with class-conditional overlapping sparse feature groups
Regularized multinomial logistic model is widely used in multi-class classification problems. For high dimension data, various regularization methods achieving sparsity have been developed and applied successfully to many real-world applications such as bioinformatics, health informatics and text mining. In many cases there exist intrinsic group structures among the features. Incorporating the ...
Gap Safe Screening Rules for Sparsity Enforcing Penalties
In high dimensional regression settings, sparsity enforcing penalties have proved useful to regularize the data-fitting term. A recently introduced technique, called screening rules, proposes to ignore some variables in the optimization by leveraging the expected sparsity of the solutions, consequently leading to faster solvers. When the procedure is guaranteed not to discard variables wrongly, the...